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The Bekko Tombo, Libellula angelina Selys, 1883 (Odonata: Libellulidae), is listed as an endangered 
species in South Korea, and is classified as a critically endangered species by the International Union 
for Conservation of Nature (IUCN). An assessment of the genetic diversity and population relationships 
of the species by molecular markers can provide the information necessary to establish effective con- 
servation strategies. In this study, we developed 10 microsatellite markers specific to L. angelina using 
the Illumina NextSeq 500 platform. Forty-three samples of L. angelina collected from three localities 
in South Korea were genotyped to validate these markers and to preliminarily assess the population 
genetic characteristics. The 10 markers revealed 4—11 alleles, 0.211—0.950 observed heterozygosity (Ho). 
and 0.659—0.871 expected heterozygosity (Hg) in the population with the largest sample size (n = 20), 
thereby validating the suitability of these markers for population analyses. Our preliminary assessment 
of the population genetic characteristics appears to indicate the following: presence of inbreeding in all 
populations, an isolation of the most geographically distant population (Seocheon), and a lower Ho than 
Hg. The microsatellite markers developed in this study will be useful for studying the population genetics 
of L. angelina collected from additional sites in South Korea and from other regions. 


Keywords: dragonfly; SSR; Illumina paired-end sequencing; endangered species; IUCN; population 
genetics 


Introduction 


The Bekko Tombo (Libellula angelina Selys, 1883; Odonata: Libellulidae), distributed through- 
out northern China, Japan, and Korea (Inoue, 2004; Jung, 2012), is listed as an endangered 
species in South Korea. It occurs in several areas in the western region of South Korea (Figure 1; 
e.g. Incheon, Gwangmyeong, Ansan, Yongin, Anseong, Seocheon, Gimje, and Jeonju); how- 
ever, because of the reduction in population and extinction, there are only a limited number of 
stable populations of L. angelina at present (Shin et al., 2012). Natural ponds and swamps rich in 
organic matter are the main habitats for Bekko Tombo species in South Korea (National Institute 
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Figure 1. The distribution and sampling localities of Libellula angelina in South Korea, with the pairwise results 
of genetic distance (Ет) іп a geographical context. The distribution localities are marked in dark grey, and sampling 
localities are marked in green. General locality names are as follows: 1, Incheon Metropolitan City; 2, Gwangmyoung 
City, Gyeonggi-do Province; 3, Ansan City, Gyeonggi-do Province; 4, Yongin City, Gyeonggi-do Province; 5, Anseong 
City, Gyeonggi-do Province; 6, Seocheon City, Chungcheongnam-do Province; 7, Gimje City, Jeollabuk-do Province; 
and 8, Jeonju City, Jeollabuk-do Province. This map was acquired from the Korea National Spatial Data Infrastructure 
Portal. The asterisk indicates statistical significance, (p < 0.05). 


of Biological Resources, 2013), but such habitats are declining because of urban development 
and expansion, which are accompanied by reclamation and contamination. 

At the international level, L. angelina has been classified as critically endangered since 1986 
by the International Union for Conservation of Nature (IUCN). According to Inoue (2004), who 
designated the species as endangered in Japan, the prime habitats necessary for the maintenance 
of sustainable L. angelina populations in the country are old and stable ponds, with moderate 
growth of emergent vegetation and an area of open water, in lowland hill areas. 

Populations decline, however, as a consequence of anthropic developments that destroy and 
degrade such habitats, besides introducing predators that threaten native species (Inoue, 2006). 
Furthermore, several dragonfly species including L. angelina, which were once common in 
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lentic habitats, are now reported as endangered, because of rapid and extensive changes and 
degradation of agricultural habitats in Japan (Kadoya, Suda, & Washitani, 2009). 

Estimation of population genetic characteristics such as population structures, genetic diver- 
sity, and genetic isolation is important for establishing a conservation strategy for endangered 
species (Moritz, 2002; Palsbgll, Berube, & Allendorf, 2007). The genetic diversity and differ- 
entiation of L. angelina in the Okegayanuma area of Japan was previously determined using 
random amplified polymorphic DNA analyses; the species was found to present low genetic 
diversity among and within populations and exhibited no significant genetic difference among 
populations (Takahashi, Fukcui, & Tubaki, 2009). However, the South Korean populations of 
L. angelina have never been subjected to population genetic analyses. 

In this study, we developed 19 new microsatellite markers from L. angelina. To our knowl- 
edge, the present study is the first of its type for this species and other congeneric species. Given 
the limited access to this endangered species and its rarity, population genetic analyses were 
performed based on the examination of a limited number of samples from three South Korean 
localities, and the validity of the markers was tested using the largest population. 


Materials and methods 


Sampling and DNA extraction 


Adult L. angelina were sampled from three localities in the western region of South Korea in June 
2016 (Figure 1). Seocheon (36.027328°N, 126.717994°E; n = 13) is the southernmost collection 
locality, and is situated approximately 135 km south of Ansan (37.272664°N, 126.581689°E; 
n = 20) and approximately 160km south of Gwangmyoung (37.458503°N, 126.869561°E; 
n = 10). Gwangmyoung and Ansan are located approximately 35 km from each other. The latter 
locality is a small offshore islet (34.39 km? in area and less than 1 km from the nearest main- 
land in South Korea). For each location, we obtained the necessary collection permits from the 
respective authorized environmental offices. 

Total DNA was extracted from the hind legs of the collected L. angelina using a Wiz- 
ard Genomic DNA Purification Kit (Promega, Madison, WI, USA) as per the manufac- 
turer’s instructions and was stored at —20°C until use. Voucher specimens were deposited 
at the National Institute of Biological Resources, Incheon, South Korea (accession numbers 
NIBRGRO0000412973—-NIBRGR0000413014, NIBRGRO0000413016). 


Development of microsatellite markers and genotyping 


For the construction of a DNA library, one specimen collected from Ansan was used. DNA 
quality and concentration were assessed using a NanoDrop spectrophotometer (NanoDrop Tech- 
nologies, Wilmington, DE, USA). Genomic DNA (200ng) was sheared into fragments of 
approximately 550 bp using a Covaris S220 ultrasonicator (Covaris, Woburn, MA, USA) and 
then processed to produce an Illumina paired-end library using a TruSeq Nano DNA Library Kit 
(Illumina, San Diego, CA, USA). Size and concentration of the prepared library were confirmed 
using an Agilent 2100 Bioanalyzer system (Agilent Technologies, Santa Clara, CA, USA) and a 
quantitative PCR-based KAPA library quantification kit (KAPA Biosystems, Wilmington, MA, 
USA), respectively. The library was sequenced on an Illumina NextSeq 500 system (San Diego, 
CA, USA) using 150 base-length read chemistry in a paired-end mode. 

For assembly, sequencing errors were discarded using the error correction module of 
ALLPATHS-LG (Gnerre et al., 2011). Then, genome assembly was performed using IDBA UD 
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Table 1. Characteristics of 10 polymorphic microsatellite loci developed for L. angelina. 





Annealing GenBank 
temperature Size accession 

Marker Motifs Primer sequence (5'-3/) (°С) (bp n a Availability? no. 

ГА13449 (АТТ) Е: ACAGGATTATGTATGACCTA 53 214 43 18 1.000 MG321252 
R: ATTTCAACTATTTCCGAGAA 

LA70081 (AG)o Е: AAACGTTTGTTACATTGAAA 53 189 43 6 0.930 MG321253 
R: TTGTCGAAATAAATACTCGA 

ГА18063 (САТ); Е: GTGGTTTATGCCATTTTAAA 53 198 43 12 0.953 MG321254 
R: CATCACCATTGATTTTCAAA 

LA3937 (ААС); Е: GAAGATCTGAGTTAAAACCT 53 234 43 14 1.000 MG321255 
R: AGAGCTCCTAAACTATTACT 

LA47390 (CT); Е: АСССССААААТТАААТАААА 53 217 43 10 0.977 MG321256 
К: СТТТСАТАСССТАТСАСТТА 

LA27 (ATT); Е: TTAATCAAAGGGTTAGATGG 53 244 43 14 0.977 MG321257 
R: TTACCGAAGAGATTTCAAAT 

LA747 (ААТ) Е: GAGATTGATATTTAGGTCCC 53 84 43 18 1.000 MG321258 
R: TTGATAGTCTTATTTTGGCA 

LA1398 (Аб) Е: AATGGCACTTAATAAGGAAT 23 229 43 9 0.977 MG321259 
R: CTCAAGATATCAATGACTGT 

LA99984 (СТ) Е: AAGAAAGGAGAGTATGTTTC 53 165 43 16 1.000 MG321260 
R: GCAATTAGTAAAACGTAACC 

LA315 (ААТ); Е: AAAACCTTAGGCAAATGATA 53 187 43 8 1.000 MG321261 





R: CCTTGAATAAAACAGTGAAG 


n, number of tested individuals; and a, number of observed alleles. “Availability is defined as 1 — Obs/n, where Obs is the number of 
observations. 


(Peng, Leung, Yiu, & Chin, 2012) with the pre-correction option. To identify reliable assem- 
blies, short reads were remapped to assembled sequences using Bowtie2 (Langmead & Salzberg, 
2012); only assembled scaffolds with a depth of > 10 х and coverage > 95% were retained for 
microsatellite marker identification. 

For genotyping, one primer of each selected pair (Gencube, Gimpo, Korea) was labeled with 
6-carboxyfluorescein (6-FAM) fluorescent dye (Yue, Chen, & Orban, 2000). Each PCR reac- 
tion mixture contained 30 ng of DNA, 1 x PCR buffer [50 mM KCI, 10 mM Tris-HCl (pH 8.8), 
1.5 mM MgCl], 2.5 mM dNTPs, 200 nM of each primer, and | unit of Pure Speed PFU DNA 
polymerase (Smarteome, Suwon, Korea); the final reaction volume was 25 ul. PCR was con- 
ducted on an ABI 2720 Thermocycler (Applied Biosystems, Carlsbad, CA, USA) using the 
following conditions: an initial denaturation step at 95?C for 3 min, a 30-cycle amplification 
(94°C for 30 s, 53°C for 30 s, and 72°C for 1 min), and a final extension step at 72°C for 
5 min (Table 1). Aliquots (0.2 ul) of the amplified PCR products were mixed with 9.8 ul Hi- 
Di Formamide (Applied Biosystems) and 0.2 ul Liz-500 size standards (Applied Biosystems). 
The mixture was then denatured at 95?C for 5 min, placed on ice, and separated on an ABI 
3730xl sequencer (Applied Biosystems). Allele sizes and genotypes were analyzed using the 
GeneMapper? Software v. 4.1 (Applied Biosystems). 


Data analyses 


To validate the markers, selected genetic diversity measures, such as number of alleles, observed 
heterozygosity (Но; Weir, 1990), and expected heterozygosity (Hg; Weir, 1990), were calculated 
per population for each locus using GenAIEx ver. 6.5 (Peakall & Smouse, 2012). Е; (Hartl & 
Clark, 1997), which is a measure of the deficiency in heterozygosity resulting from non-random 
mating, was estimated per population for each locus and also for each population for all loci using 
СепАІЕх ver. 6.5 (Peakall & Smouse, 2012). Allelic richness (АК) standardized for variation 
in sample size—was calculated per population for each locus using FSTAT 2.9.3.2 (Goudet, 
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2001). Genotypic linkage disequilibrium (LD) between all pairs of loci, as well as deviation of 
genotypic frequencies from the Hardy—Weinberg equilibrium (HWE) per population per locus, 
were tested using GENEPOP Web ver. 4.2 (Raymond & Rousset, 1995; Rousset, 2008) with 
the Markov-chain approach modified from Guo and Thompson (1992) using 10,000 steps of 
dememorization and iteration. The 95% significance levels for HWE and linkage disequilibrium 
(LD) tests were adjusted using a Bonferroni correction (Rice, 1989). The fixation index (FsT) 
(Weir & Cockerham, 1984) between all pairs of populations was estimated based on the infinite 
allele model of mutation using Arlequin v. 3.5 (Excoffier & Lischer, 2010). The significance of 
the Fsr between all pairs of populations was calculated using Fisher’s exact test based on 10,000 
permutations. Principal coordinates analysis (PCoA) via covariance with standardization of the 
population genetic distances was performed to detect and plot the relationships between popula- 
tions using GenAIEx ver. 6.5 with default parameters (Peakall & Smouse, 2012). STRUCTURE 
ver. 2.3.3 (Pritchard, Stephens, & Donnelly, 2000) was employed to identify the true number of 
subgroups (K) using the method described by Evanno, Regnaut, & Goudet (2005). An admixture 
model with correlated allele frequencies was used, with the K-value ranging from 1 to 10. Ten 
independent runs were performed for each K-value, with a burn-in period of 10,000 iterations, 
followed by 50,000 iterations for data collection. The structure result was visualized using the 
web-based tool, STRUCTURE HARVESTER ver. 0.6.8 (Earl & von Holdt, 2012). 


Results and discussion 


Sequencing and microsatellite selection 


The sequencing of 150bp paired-end reads from the Illumina library resulted in a total of 
160,044,678 reads (Table 2). A total of 251,215 assembled scaffolds with an average length of 
3,192.18 bp were retained for microsatellite marker identification (Table 2). Trinucleotide repeats 
were the most abundant class of microsatellites (45,160 regions) in the L. angelina genome, fol- 
lowed by dinucleotide (11,536 regions), tetranucleotide (5089 regions), pentanucleotide (201 
regions), and hexanucleotide (94 regions) repeats (Table 2). Sequencing data were deposited 
in the Sequence Read Archive of GenBank (accession number SRR8432568). After testing the 
candidate microsatellites for the availability of primer sites, amplification efficiency, degree of 
polymorphism, and specificity for target loci, 10 markers were eventually selected for use in 
subsequent genotyping (Table 1). 

The analysis of 43 genotyped L. angelina samples from three South Korean populations (20, 
10, and 13 L. angelina from Ansan, Gwangmyoung, and Seocheon, respectively) revealed that 
the observed number of alleles at each locus ranged from 6 to 18, and availability ranged from 
0.93 to 1 (Table 1). Tests of genotypic LD showed no significant association of alleles among 
the 10 loci after applying Bonferroni correction, indicating that all loci can be considered as 
independent markers. The GenBank accession numbers of the 10 loci are listed in Table 1. At 
each locus level, the observed number of alleles, Но, and Hg were 4—13, 0.211—0.950, and 
0.659—0.871, respectively, in the L. angelina samples at Ansan, which had the largest sample size 
(20 L. angelina samples; Table 3), thereby validating the suitability of the markers for population 
analyses. In the samples from Ansan, six of the 10 loci showed significant deviation from the 
Hardy- Weinberg equilibrium. 


Population genetic analysis 


The allelic patterns across populations indicated an absence of obvious differences among pop- 
ulations in terms of mean total number of alleles per population, number of effective alleles, 
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Table 2. Summarized statistics of the Illumina NextSeq 500 paired-end read 
sequence data, de novo assembly, filtered scaffolds, and perfect microsatellite loci 


of the Libellula angelina genome. 


Sequencing data summary 





Platform NextSeq 500 
Library type Paired-end 
Read length 150 
Number of reads 160,044,678 
Total bp 24,166,746,378 





Assembled genome summary 





Number of scaffolds 579,606 
N50 5146 
N80 1804 
N90 751 
Longest (shortest) scaffold bp 91,208 (100) 
GC level 35.3096 
Scaffold bp 915,213,354 
Scaffold average length 1579.03 





Scaffolds (> 10 x , > 9596 coverage) 











Number of scaffolds 251,215 
N50 6006 
Scaffold bp 801,923,072 
Average coverage 31.33 x 
Scaffold average length 3192.18 
Perfect microsatellite sequences 62,080 
Dinucleotides 11,536 
Trinucleotides 45,160 
Tetranucleotides 5089 
Pentanucleotides 201 
Hexanucleotides 94 


Ho, and Hg (Figure 2). Within-population gene diversity, which corresponds to Hg in diploid 
data, ranged from 0.784 to 0.815 (Figure 2). Lower Ho than Hg and a positive estimate of Fis, 
which is an evidence for the existence of inbreeding, were detected in all populations (Figure 2). 
The PCoA based on the first principal coordinate showed that the L. angelina population at 
Seocheon (the most distantly located region—at least 135 km—from the other two populations) 
showed a substantial divergence that accounted for 94.79% of the variation (Figure 3). Further- 
more, the remaining two populations at Ansan and Gwangmyoung (located only 35 km from 
each other), did not form an immediately close group based on the second principal com- 
ponent, which accounted for 5.21% of the variation (Figure 3). Pairwise Fsr estimates also 
supported the result of PCoA analysis, highlighting the significant genetic differentiation of the 
L. angelina population at Seocheon from that at Ansan and Gwangmyoung, whereas no sig- 
nificant genetic difference was found between the L. angelina populations at Ansan and 
Gwangmyoung (Figure 1). An examination of the likelihood scores from 10 replicate runs across 
K-values from 1 to 10 indicated that the optimal K-value was 2, suggesting the presence of two 
genetic groups (Figure 4). The assignment results from K — 2 showed that both gene pools 
were found in all populations, although the frequency of each gene pool in each population dif- 
fered. This result indicates that the three populations of L. angelina shared the same gene pools, 
although the PCoA апа Fsr supported the isolation of L. angelina population at Seocheon from 
the two remaining populations. 


Table 3. Characteristics of 10 polymorphic microsatellite loci developed from three populations of L. angelina. 


Marker Ansan Gwangmyoung Seocheon 





n a AR Ho Hg Fis HWE? n AR Ho Hg Fis HWE? n a AR Но Hg Fis HWE* 


a 





LA1398 20 6 4.94 0.211 0.698 0.698 0.000* 10 
LA99984 20 12 898 0.650 0.824 0.211 0.005* 10 
LA315 20 7 5.73 0.950 0.760  -0.250 0.072 10 


6.00 0.500 0.755 0.338 0.003* 13 
9.00 0.700 0.715 0.021 0.164 13 
5.00 1.000 0.765 -0.307 0.576 13 


6.62 0.077 0.731 0.895 0.000* 
7.81 0.385 0.855 0.550 0.000* 
5.94 0923 0.805  -0.147 0.305 


LA134449 20 12 9.09 0.500 0.870 0.425 0.000* 10 12 12.00 0.600 0.905 0.337 0.002* 13 11 10.08 0.769 0.882 0.128 0.113 
LA70081 19 4 3.78 0.421 0.659 0.361 0.009 10 4 4.00 0.300 0.705 0.574 0.003* 11 4 4.00 0.364 0.694 0.476 0.004* 
LA18063 20 11 8.33 0.500 0.834 0.400 0.000* 10 7 7.00 0.200 0.845 0.763 0.000* 11 8 7.81 0.091 0.831 0.891 0.000* 
LA3937 20 11 8.11 0.750 0.840 0.107 0.098 10 8 8.00 0.800 0.800 0.000 0.316 13 10 9.12 0.846 0.849 0.003 0.644 
LA47390 19 8 644 0.263 0.785 0.665 0.000* 10 6 6.00 0.500 0.730 0.315 0.034 13 9 8.43 0.385 0.858 0.552 0.000* 
LA27 20 13 9.55 0.632 0.850 0.257 0.007 10 8 8.00 0.700 0.795 0.119 0.372 13 8 7.35 0.538 0.772 0.303 0.046 
ГА747 20 12 9.65 0.600 0.871 0.311 0.002* 10 7 7.00 0.300 0.820 0.634 0.000* 13 11 9.78 0.462 0.873 0.471 0.000* 

6 7 

9 8 

3 6 


n, number of tested individuals; a, number of observed alleles; AR, allelic richness; Ho, observed heterozygosity; Hg, expected heterozygosity; and Fıs, inbreeding coefficient. ^ Significant deviation from 
Hardy-Weinberg equilibrium after a Bonferroni correction (*P = 0.05/10 = < 0.005). 
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Figure 2. Тһе mean allelic patterns of 10 loci of L. angelina across three populations. Na, number of different alleles; 
Na Freq. > 596, number of alleles with frequency greater than 5%; Ме, number of effective alleles; I, Shannon's informa- 
tion index; No. Private Alleles, number of alleles unique to a single population; No. LComm Alleles ( < 25%), number 
of locally common alleles occurring in < 25% of the populations; No. LComm Alleles (< 50%), number of locally 
common alleles occurring in < 50% of the populations; Hg, expected heterozygosity; Ho, observed heterozygosity; 
and Fis, inbreeding coefficient. 
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Figure 3. Principal coordinate analysis (PCoA) of three populations based on allelic variance of 10 microsatellite loci. 
The percentage variations explained by the first and second axes are indicated. 


In conclusion, a suite of polymorphic microsatellite markers specific to L. angelina was devel- 
oped using a next-generation sequencing technique. Considering the results presented in this 
study, these markers will be useful for studies on the genetic structure of undiscovered South 
Korean and Asian populations of L. angelina. This is particularly relevant considering the lim- 
ited number of previous population genetic studies for this endangered species (e.g. Takahashi 
et al., 2009). Although the results of this study are based on a limited number of samples, lower 
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Figure 4. Cluster analysis of multilocus microsatellite data of L. angelina performed using STRUCTURE software. 
(A) Plot of Delta K calculated with the formula Delta K = mean (|L’’(K)|)/sd(L(K)), n = 43. (B) Bar plot of estimated 
membership of each individual in K = 2 clusters. Black vertical bars separate the three populations. Different colors 
represent different gene pools. 


Ho than Hg, positive estimates of Fıs, and the substantial isolation of the L. angelina popula- 
tion at Seocheon from the other two populations (besides the lesser distance between the two 
remaining populations) collectively suggest that the South Korean populations of L. angelina 
consist of small, more or less isolated, and inbreeding populations. This result is consistent with 
field observations based on which the species was classified as endangered. Nevertheless, the 
shared gene pools in all populations indicates that the isolation of the L. angelina population 
at Seocheon from the other populations may not be sufficient for creating an independent gene 
pool. As more samples are being collected from different regions of Asian countries, including 
South Korea, more thorough data analyses of population genetics will be possible. 
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